AITopics | placement strategy

Collaborating Authors

placement strategy

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

WarmServe: Enabling One-for-Many GPU Prewarming for Multi-LLM Serving

Lou, Chiheng, Qi, Sheng, Kang, Rui, Zhang, Yong, Sun, Chen, Wang, Pengcheng, Liu, Bingyang, Liu, Xuanzhe, Jin, Xin

arXiv.org Artificial IntelligenceDec-11-2025

Deploying multiple models within shared GPU clusters is promising for improving resource efficiency in large language model (LLM) serving. Existing multi-LLM serving systems optimize GPU utilization at the cost of worse inference performance, especially time-to-first-token (TTFT). We identify the root cause of such compromise as their unawareness of future workload characteristics. In contrast, recent analysis on real-world traces has shown the high periodicity and long-term predictability of LLM serving workloads. We propose universal GPU workers to enable one-for-many GPU prewarming that loads models with knowledge of future workloads. Based on universal GPU workers, we design and build WarmServe, a multi-LLM serving system that (1) mitigates cluster-wide prewarming interference by adopting an evict-aware model placement strategy, (2) prepares universal GPU workers in advance by proactive prewarming, and (3) manages GPU memory with a zero-overhead memory switching mechanism. Evaluation under real-world datasets shows that WarmServe improves TTFT by up to 50.8$\times$ compared to the state-of-the-art autoscaling-based system, while being capable of serving up to 2.5$\times$ more requests compared to the GPU-sharing system.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2512.09472

Genre: Research Report (1.00)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

SMART: A Surrogate Model for Predicting Application Runtime in Dragonfly Systems

Wang, Xin, Rizzini, Pietro Lodi, Medya, Sourav, Lan, Zhiling

arXiv.org Artificial IntelligenceNov-17-2025

The Dragonfly network, with its high-radix and low-diameter structure, is a leading interconnect in high-performance computing. A major challenge is workload interference on shared network links. Parallel discrete event simulation (PDES) is commonly used to analyze workload interference. However, high-fidelity PDES is computationally expensive, making it impractical for large-scale or real-time scenarios. Hybrid simulation that incorporates data-driven surrogate models offers a promising alternative, especially for forecasting application runtime, a task complicated by the dynamic behavior of network traffic. We present \ourmodel, a surrogate model that combines graph neural networks (GNNs) and large language models (LLMs) to capture both spatial and temporal patterns from port level router data. \ourmodel outperforms existing statistical and machine learning baselines, enabling accurate runtime prediction and supporting efficient hybrid simulation of Dragonfly networks.

iteration time, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.11111

Country: North America > United States (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.76)

Add feedback

Deep Reinforcement Learning for Urban Air Quality Management: Multi-Objective Optimization of Pollution Mitigation Booth Placement in Metropolitan Environments

Rajesh, Kirtan, Kumar, Suvidha Rupesh

arXiv.org Artificial IntelligenceOct-8-2025

This is the preprint version of the article published in IEEE Access vol. 13, pp. 146503--146526, 2025, doi:10.1109/ACCESS.2025.3599541. Please cite the published version. Urban air pollution remains a pressing global concern, particularly in densely populated and traffic-intensive metropolitan areas like Delhi, where exposure to harmful pollutants severely impacts public health. Delhi, being one of the most polluted cities globally, experiences chronic air quality issues due to vehicular emissions, industrial activities, and construction dust, which exacerbate its already fragile atmospheric conditions. Traditional pollution mitigation strategies, such as static air purifying installations, often fail to maximize their impact due to suboptimal placement and limited adaptability to dynamic urban environments. This study presents a novel deep reinforcement learning (DRL) framework to optimize the placement of air purification booths to improve the air quality index (AQI) in the city of Delhi. We employ Proximal Policy Optimization (PPO), a state-of-the-art reinforcement learning algorithm, to iteratively learn and identify high-impact locations based on multiple spatial and environmental factors, including population density, traffic patterns, industrial influence, and green space constraints. Our approach is benchmarked against conventional placement strategies, including random and greedy AQI-based methods, using multi-dimensional performance evaluation metrics such as AQI improvement, spatial coverage, population and traffic impact, and spatial entropy.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2505.00668

Country: Asia > India (0.46)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)
Health & Medicine > Public Health (1.00)
(5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

From Principles to Practice: A Systematic Study of LLM Serving on Multi-core NPUs

Zhu, Tianhao, Feng, Dahu, Feng, Erhu, Xia, Yubin

arXiv.org Artificial IntelligenceOct-8-2025

With the widespread adoption of Large Language Models (LLMs), the demand for high-performance LLM inference services continues to grow. To meet this demand, a growing number of AI accelerators have been proposed, such as Google TPU, Huawei NPU, Graphcore IPU, and Cerebras WSE, etc. Most of these accelerators adopt multi-core architectures to achieve enhanced scalability, but lack the flexibility of SIMT architectures. Therefore, without careful configuration of the hardware architecture, as well as deliberate design of tensor parallelism and core placement strategies, computational resources may be underutilized, resulting in suboptimal inference performance. To address these challenges, we first present a multi-level simulation framework with both transaction-level and performance-model-based simulation for multi-core NPUs. Using this simulator, we conduct a systematic analysis and further propose the optimal solutions for tensor parallelism strategies, core placement policies, memory management methods, as well as the selection between PD-disaggregation and PD-fusion on multi-core NPUs. We conduct comprehensive experiments on representative LLMs and various NPU configurations. The evaluation results demonstrate that, our solution can achieve 1.32x-6.03x speedup compared to SOTA designs for multi-core NPUs across different hardware configurations. As for LLM serving, our work offers guidance on designing optimal hardware architectures and serving strategies for multi-core NPUs across various LLM workloads.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.05632

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Optimal Sensor Placement Using Combinations of Hybrid Measurements for Source Localization

Tang, Kang, Xu, Sheng, Yang, Yuqi, Kong, He, Ma, Yongsheng

arXiv.org Artificial IntelligenceApr-10-2025

This paper focuses on static source localization employing different combinations of measurements, including time-difference-of-arrival (TDOA), received-signal-strength (RSS), angle-of-arrival (AOA), and time-of-arrival (TOA) measurements. Since sensor-source geometry significantly impacts localization accuracy, the strategies of optimal sensor placement are proposed systematically using combinations of hybrid measurements. Firstly, the relationship between sensor placement and source estimation accuracy is formulated by a derived Cramér-Rao bound (CRB). Secondly, the A-optimality criterion, i.e., minimizing the trace of the CRB, is selected to calculate the smallest reachable estimation mean-squared-error (MSE) in a unified manner. Thirdly, the optimal sensor placement strategies are developed to achieve the optimal estimation bound. Specifically, the specific constraints of the optimal geometries deduced by specific measurement, i.e., TDOA, AOA, RSS, and TOA, are found and discussed theoretically. Finally, the new findings are verified by simulation studies.

artificial intelligence, localization, sensor, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/RadarConf2458775.2024.10548509

2504.03769

Country: Asia > China (0.15)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Communications > Networks > Sensor Networks (1.00)
Information Technology > Artificial Intelligence (1.00)

Add feedback

Non-Overlapping Placement of Macro Cells based on Reinforcement Learning in Chip Design

Yu, Tao, Gao, Peng, Wang, Fei, Yuan, Ru-Yue

arXiv.org Artificial IntelligenceJul-30-2024

Due to the increasing complexity of chip design, existing placement methods still have many shortcomings in dealing with macro cells coverage and optimization efficiency. Aiming at the problems of layout overlap, inferior performance, and low optimization efficiency in existing chip design methods, this paper proposes an end-to-end placement method, SRLPlacer, based on reinforcement learning. First, the placement problem is transformed into a Markov decision process by establishing the coupling relationship graph model between macro cells to learn the strategy for optimizing layouts. Secondly, the whole placement process is optimized after integrating the standard cell layout. By assessing on the public benchmark ISPD2005, the proposed SRLPlacer can effectively solve the overlap problem between macro cells while considering routing congestion and shortening the total wire length to ensure routability.

layout, macro cell, standard cell, (17 more...)

arXiv.org Artificial Intelligence

2407.18499

Country:

Asia > China > Heilongjiang Province > Harbin (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.82)

Industry: Semiconductors & Electronics (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Bin Packing Optimization via Deep Reinforcement Learning

Wang, Baoying, Dong, Huixu

arXiv.org Artificial IntelligenceMar-19-2024

The Bin Packing Problem (BPP) has attracted enthusiastic research interest recently, owing to widespread applications in logistics and warehousing environments. It is truly essential to optimize the bin packing to enable more objects to be packed into boxes. Object packing order and placement strategy are the two crucial optimization objectives of the BPP. However, existing optimization methods for BPP, such as the genetic algorithm (GA), emerge as the main issues in highly computational cost and relatively low accuracy, making it difficult to implement in realistic scenarios. To well relieve the research gaps, we present a novel optimization methodology of two-dimensional (2D)-BPP and three-dimensional (3D)-BPP for objects with regular shapes via deep reinforcement learning (DRL), maximizing the space utilization and minimizing the usage number of boxes. First, an end-to-end DRL neural network constructed by a modified Pointer Network consisting of an encoder, a decoder and an attention module is proposed to achieve the optimal object packing order. Second, conforming to the top-down operation mode, the placement strategy based on a height map is used to arrange the ordered objects in the boxes, preventing the objects from colliding with boxes and other objects in boxes. Third, the reward and loss functions are defined as the indicators of the compactness, pyramid, and usage number of boxes to conduct the training of the DRL neural network based on an on-policy actor-critic framework. Finally, a series of experiments are implemented to compare our method with conventional packing methods, from which we conclude that our method outperforms these packing methods in both packing accuracy and efficiency.

algorithm, placement strategy, target position, (15 more...)

arXiv.org Artificial Intelligence

2403.1242

Country:

North America > United States > Michigan (0.04)
Europe > United Kingdom > Wales (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

An Adaptive Placement and Parallelism Framework for Accelerating RLHF Training

Xiao, Youshao, Wu, Weichang, Zhou, Zhenglei, Mao, Fagui, Zhao, Shangchun, Ju, Lin, Liang, Lei, Zhang, Xiaolu, Zhou, Jun

arXiv.org Artificial IntelligenceJan-24-2024

Recently, ChatGPT or InstructGPT like large language models (LLM) has made a significant impact in the AI world. Many works have attempted to reproduce the complex InstructGPT's training pipeline, namely Reinforcement Learning with Human Feedback (RLHF). However, the mainstream distributed RLHF training methods typically adopt a fixed model placement strategy, referred to as the Flattening strategy. This strategy treats all four interdependent models involved in RLHF as a single entity, distributing them across all devices and applying parallelism techniques designed for a single model, regardless of the different workloads inherent to each model. As a result, this strategy exacerbates the generation bottlenecks in the RLHF training and degrades the overall training efficiency. To address these issues, we propose an adaptive model placement framework that offers two flexible model placement strategies. The Interleaving strategy helps reduce memory redundancy and communication costs of RLHF training by placing models without dependencies on exclusive devices with careful orchestration. On the other hand, the Separation strategy improves the throughput of model training by separating the training and inference runtime of the RLHF pipeline with additional shadow models. Furthermore, our framework provides a simple user interface and allows for the agile allocation of models across devices in a fine-grained manner for various training scenarios, involving models of varying sizes and devices of different scales. Extensive experiments have demonstrated that our Interleaving and Separation strategies can achieve notable improvements up to 11X, compared to the current SOTA approaches. The results highlight the effectiveness and adaptability of our approaches in accelerating the training of distributed RLHF.

flattening strategy, placement strategy, reward model, (16 more...)

arXiv.org Artificial Intelligence

2312.11819

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Tessel: Boosting Distributed Execution of Large DNN Models via Flexible Schedule Search

Lin, Zhiqi, Miao, Youshan, Xu, Guanbin, Li, Cheng, Saarikivi, Olli, Maleki, Saeed, Yang, Fan

arXiv.org Artificial IntelligenceNov-26-2023

Increasingly complex and diverse deep neural network (DNN) models necessitate distributing the execution across multiple devices for training and inference tasks, and also require carefully planned schedules for performance. However, existing practices often rely on predefined schedules that may not fully exploit the benefits of emerging diverse model-aware operator placement strategies. Handcrafting high-efficiency schedules can be challenging due to the large and varying schedule space. This paper presents Tessel, an automated system that searches for efficient schedules for distributed DNN training and inference for diverse operator placement strategies. To reduce search costs, Tessel leverages the insight that the most efficient schedules often exhibit repetitive pattern (repetend) across different data inputs. This leads to a two-phase approach: repetend construction and schedule completion. By exploring schedules for various operator placement strategies, Tessel significantly improves both training and inference performance. Experiments with representative DNN models demonstrate that Tessel achieves up to 5.5x training performance speedup and up to 38% inference latency reduction.

placement strategy, repetend, tessel, (14 more...)

arXiv.org Artificial Intelligence

2311.15269

Country:

Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Active Velocity Estimation using Light Curtains via Self-Supervised Multi-Armed Bandits

Ancha, Siddharth, Pathak, Gaurav, Zhang, Ji, Narasimhan, Srinivasa, Held, David

arXiv.org Artificial IntelligenceMay-29-2023

To navigate in an environment safely and autonomously, robots must accurately estimate where obstacles are and how they move. Instead of using expensive traditional 3D sensors, we explore the use of a much cheaper, faster, and higher resolution alternative: programmable light curtains. Light curtains are a controllable depth sensor that sense only along a surface that the user selects. We adapt a probabilistic method based on particle filters and occupancy grids to explicitly estimate the position and velocity of 3D points in the scene using partial measurements made by light curtains. The central challenge is to decide where to place the light curtain to accurately perform this task. We propose multiple curtain placement strategies guided by maximizing information gain and verifying predicted object locations. Then, we combine these strategies using an online learning framework. We propose a novel self-supervised reward function that evaluates the accuracy of current velocity estimates using future light curtain placements. We use a multi-armed bandit framework to intelligently switch between placement policies in real time, outperforming fixed policies. We develop a full-stack navigation system that uses position and velocity estimates from light curtains for downstream tasks such as localization, mapping, path-planning, and obstacle avoidance. This work paves the way for controllable light curtains to accurately, efficiently, and purposefully perceive and navigate complex and dynamic environments. Project website: https://siddancha.github.io/projects/active-velocity-estimation/

artificial intelligence, data mining, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2302.12597

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > California > Santa Clara County > San Jose (0.04)
Europe > Latvia > Riga Municipality > Riga (0.04)

Genre: Research Report (0.50)

Industry:

Transportation (0.46)
Energy (0.46)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
(3 more...)

Add feedback